Method for run-length encoding of a bitmap data stream

ABSTRACT

Subtitling aims at the presentation of text information and graphical data, encoded as pixel bitmaps. The size of subtitle bitmaps may exceed video frame dimensions, so that only portions are displayed at a time. The bitmaps are a separate layer lying above the video, e.g. for synchronized video subtitles, animations and navigation menus, and therefore contain many transparent pixels. An advanced adaptation for bitmap encoding for HDTV, e.g. 1920×1280 pixels per frame as defined for the Blu-ray Disc Prerecorded format, providing optimized compression results for such subtitling bitmaps, is achieved by a four-stage run length encoding. Shorter or longer sequences of pixels of a preferred color, e.g. transparent, are encoded using the second or third shortest code words, while single pixels of different color are encoded using the shortest code words, and sequences of pixels of equal color use the third or fourth shortest code words.

FIELD OF THE INVENTION

This invention relates to a method for encoding a data stream, particularly a bitmap coded subtitling data stream.

BACKGROUND

Broadcast or read-only media containing video data may also comprise subpicture data streams, containing textual or graphical information needed to provide subtitles, glyphs or animation for any particular purpose, e.g. menu buttons. Since displaying of such information may usually be enabled or disabled, it is overlaid on the associated video image as an additional layer, and is implemented as one or more rectangular areas called regions. Each such region has a specified set of attributes, e.g. area size, area position or background color. Due to the region being overlaid on the video image, its background is often defined to be transparent so that the video image can be seen, or multiple subpicture layers can be overlaid. Further, a subtitle region may be broader than the associated image, so that only a portion of the subtitle region is visible, and the visible portion of the region is shifted e.g. from right to left through the whole subtitle area, which looks as if the subtitles were shifting through the display. This method of pixel based subtitling is described in the European Patent application EP02025474.4 and is called cropping.

Subtitles were originally meant as a support for handicapped people, or to save the costs for translating a film into rarely used languages, and therefore for pure subtitle text it would be enough if the subtitle data stream contained e.g. ASCII coded characters. But subtitles today also contain other elements, up to high-resolution images, glyphs or animated graphical objects. Handling of such elements is easier if the subtitling stream is coded in bitmap format, with the lines of an area and the pixels within a line being coded and decoded successively. This format contains much redundancy, e.g. when successive pixels have the same color value. This redundancy can be reduced by various coding methods, e.g. run-length encoding (RLE). RLE is often used when sequences of data have the same value, and its basic ideas are to code the sequence length and the value separately, and to code the most frequent code words as short as possible.

Particularly when encoding the subtitle layer for 1920×1280 pixels high-definition video (HDTV), a coding algorithm that is optimized for this purpose is needed to reduce the required amount of data.

SUMMARY OF THE INVENTION

The purpose of the invention is to disclose a method for optimized encoding of subtitle or subpicture layers for high-resolution video, such as HDTV, being represented as bitmap formatted areas that may be much broader than the visible video frame.

According to the invention, four-stage run-length encoding (RLE) is used for this purpose, with the shortest code words being used for single pixels having individual color values other than transparent, the second shortest code words being used for shorter sequences of transparent pixels, the third shortest code words being used for longer sequences of transparent pixels and shorter sequences of pixels of equal color other than transparent, and the fourth shortest code words being used for longer sequences of pixels of equal color other than transparent. Usually, most of the pixels within the subtitle layer are transparent. Unlike conventional RLE, where the most frequent data use the shortest code words, this method comprises using the second shortest code words for short sequences of the most frequent color, and the third shortest code words for longer sequences of the most frequent color and also short sequences of other colors. Shortest code words are reserved for single pixels of other than the most frequent color. This is advantageous when pixels of the most frequent color almost always appear in sequences, as is the case for transparent pixels in the subtitle layer, while single pixels of individual color are more likely to be not transparent.

Advantageously, a code according to the inventive method incorporates only a few redundant code words, which are defined to be among the longer code words. E.g. a single pixel of any color other than transparent is ideally coded with a code word of the shortest type, but a code word of the third shortest type may be used as well, with the sequence length being one. Though the latter possibility will usually not be used for this purpose, these unused code words, or gaps in the code word space, can be used for transportation of other information. An example is the end-of-line information that can be used for resynchronization. According to the invention, the shortest redundant code word is used to code this information.

As another advantage, the disclosed method reduces the amount of required data, thus compressing the subtitle data stream, with the compression factor depending on the contents of the data stream. Particularly high compression factors are achieved for data combinations that appear very often in typical subtitling streams. These are sequences of length shorter than e.g. 64 pixels that have the same color value, but also sequences of transparent pixels having any length and single pixels having individual color values. The first of these groups is often used in characters or glyphs, the second of these groups is used before, between and after the displayed elements of the subtitling stream, and the third of these groups is used in images, or areas with slightly changing color. Since transparent pixels hardly ever appear in very short sequences, e.g. less than three pixels, it is sufficient to code them not with the shortest but only with the second shortest code words.

Simultaneously, the inventive method may efficiently handle sequences that are longer than 1920 pixels, and e.g. may be up to 16383 pixels long, thus enabling very wide subtitling areas.

Further, the coding method generates a unique value representing the end of a line, and therefore in the case of loss of synchronization it is possible to resynchronize each line.

Advantageously, the inventive method is optimized for coding this combination of a number of features being typical for subtitling streams.

Therefore the amount of data required for the subtitling stream may be reduced, which leads to better utilization of transmission bandwidth in the case of broadcast, or to a reduced pick-up jump frequency in the case of storage media where a single pick-up reads multiple data streams, e.g. in Blu-ray disc (BD) technology. Further, the better the subtitling bitmap is compressed, the higher capacity in terms of bit-rate will be left for audio and video streams, increasing picture or audio quality.

Advantageous embodiments of the invention are disclosed in the dependent claims, the following description and the figures.

BRIEF DESCRIPTION OF THE DRAWINGS

Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in

FIG. 1 cropping of a subtitle area in a video frame;

FIG. 2 a pixel sequence in a subtitle area;

FIG. 3 a coding table for subtitling, including text and graphics;

FIG. 4 a table with an exemplary syntax of an extended object data segment for the Blu-ray Prerecorded standard; and

FIG. 5 a flow chart of the encoding method.

DETAILED DESCRIPTION OF THE INVENTION

While subtitling in pre-produced audio-visual (AV) material for broadcast or movie discs is primarily optimized for representing simple static textual information, e.g. Closed Caption, Teletext or DVB-Subtitle, progress in multimedia development for presentation and animation of textual and graphic information adequate to new HDTV formats requires an advanced adaptation for bitmap encoding. FIG. 1 shows a video frame TV and a subtitle area SUB containing text and graphical elements G, with the subtitle area SUB being bitmap coded. The size of the subtitle area SUB may exceed the video frame dimensions, as e.g. for the Blu-ray Disc Prerecorded (BDP) format subtitle bitmaps are allowed for one dimension to be larger than the video frame. Then the lines are cropped before being displayed, i.e. a portion matching the respective frame dimension is cut out of the virtual line and displayed, overlaying the video image. In FIG. 1, the subtitle area SUB of width B_(SUB) is cropped, so that only a portion of width B_(TV) is visible. For standard HDTV, as used e.g. for BDP, B_(TV) is 1920 pixels, while B_(SUB) may be much more.
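The cropping step described above can be pictured as taking a window of width B_(TV) out of each decoded subtitle line of width B_(SUB). The following is only a minimal sketch under stated assumptions: the function name `crop_line`, the `offset` parameter and the representation of a line as a Python list of color values are illustrative and are not taken from the BDP format.

```python
from typing import List


def crop_line(line: List[int], offset: int, visible_width: int = 1920) -> List[int]:
    """Cut the visible portion out of one (virtual) subtitle line.

    `line` holds one color value per pixel, `offset` is the hypothetical
    horizontal position of the visible window within the subtitle area.
    """
    if offset < 0 or offset + visible_width > len(line):
        raise ValueError("visible window lies outside the subtitle area")
    return line[offset:offset + visible_width]


# A subtitle line that is wider than the frame: 5000 pixels, mostly transparent (0).
subtitle_line = [0] * 5000
subtitle_line[2000:2010] = [7] * 10

# Shifting the window from frame to frame makes the content scroll.
visible = crop_line(subtitle_line, offset=1500)
```

Increasing `offset` from one frame to the next would move the visible portion through the subtitle area, which is the shifting effect described in the background section.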

Due to the rectangular shape of the subtitle area SUB, most pixels in that area are transparent. This is shown at an enlarged scale in FIG. 2, in a simplified manner since usually a line SL1,SL2 on a HDTV screen TV must be several pixels wide in order to be clearly visible. A line is herein understood as a horizontal structure. Each line of subtitle data usually contains one or more pixel sequences of equal color. FIG. 2 shows a part of a subtitle line SL1 containing transparent sequences PS1,PS5, but also single visible pixels PS4, shorter visible lines PS2 and longer visible lines PS3. Most pixels within a line are transparent. This is the case between characters, but also at the beginning and at the end of subtitling lines. Since lines begin and end with transparent sections, each line contains one more transparent than colored section. But transparent sections PS1,PS5 are usually longer, while for pixel sequences other than transparent, used e.g. for characters, the most frequent case is a sequence length of 64 or less. This can be recognized from a rough estimation, assuming that at least 25 characters are displayed simultaneously, and that the space between characters has about one quarter the width of a character, so that a single character may use not more than 1920/25*(8/10)≈62 pixels within a line. Often, a line SL2 contains only very few visible pixels, and therefore only few transparent sequences that are very long.
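The rough estimation can be retraced as follows: if the gap between characters is one quarter of a character width, one character plus its gap occupies 1.25 character widths, so the character itself takes 1/1.25 = 8/10 of the space available per character. The sketch below only reproduces this arithmetic; the function name and parameters are illustrative.

```python
import math


def max_char_width(frame_width: int, chars_per_line: int, gap_ratio: float) -> float:
    """Width available to one character when each character is followed
    by a gap of `gap_ratio` times its own width."""
    pitch = frame_width / chars_per_line   # pixels per character including its gap
    return pitch / (1.0 + gap_ratio)       # 1 / 1.25 equals the factor 8/10 in the text


width = max_char_width(1920, 25, 0.25)     # about 61.4 pixels
rounded_up = math.ceil(width)              # rounds up to the 62 pixels quoted above
```

The result stays below 64, which is why a six-bit length field suffices for the typical run inside a character.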

A code being a preferred embodiment of the invention is listed in FIG. 3. It is a run-length code, comprising code words of lengths ranging from 1 byte up to 4 bytes, with 8 bits per byte. It is capable of coding 256 different colors, with one preferred color. The preferred color is in this example ‘transparent’, but may be any other color if adequate. A color look-up table (CLUT) may transform the decoded color values into the actual display color. Further, pixel sequences of equal color may be coded in two ranges, with the shorter range being up to 63 pixels and the longer range being up to 16383 pixels.

The shortest code words of 1 byte length are used to code a single pixel having any individual color other than the preferred color, which is here transparent. The color value CCCCCCCC may range from 1 up to 255, and may represent a color directly or indirectly. E.g. it may represent an entry in a color look-up table (CLUT) that contains the actual color code. One of the 8-bit values, containing only zeros (00000000), serves as an escape sequence, indicating that the following bits have to be considered as part of the same code word. In that case, the code word tree has four possible branches, marked by the two following bits.

In the first branch, indicated by the following bits being 00_(b), valid code words have two bytes, and a shorter sequence of up to 63 pixels is coded having the preferred color, e.g. transparent. The only invalid code word in this branch is the one that comprises only 0's, since 0 represents no valid sequence length. This code word ‘00000000 00000000’ may be used for other purposes. According to the invention, it is used to indicate the end of a line since it is the shortest redundant code word.

In the second branch, indicated by the following bits being 01_(b), the code word comprises another byte, and the fourteen L bits are used to code the length of a pixel sequence of the preferred color, e.g. transparent. Thus, the sequence length may be up to 2¹⁴−1=16383. The code words where the L bits have a value below 64 are redundant, and may be used for other purposes.

In the third branch, indicated by the following bits being 10_(b), the code words comprise an additional byte, and the six L bits of the second byte represent the length of a shorter sequence of up to 63 pixels, which have another than the preferred color. The actual color is directly or indirectly represented by the CCCCCCCC value of the third byte. The code words with a sequence length LLLLLL below three are redundant, since a sequence of one or two pixels of this color can be coded more cheaply using one byte per pixel, as described above, and a sequence length of zero is invalid. These code words may be used for other purposes.

In the fourth branch, indicated by the following bits being 11_(b), the code words comprise two additional bytes, wherein the remaining six bits of the second byte and the third byte give the length of a longer sequence of 64 up to 16383 pixels, and the color value CCCCCCCC of the fourth byte gives the color, directly or indirectly and not being the preferred color. The code words with a sequence length below 64 are redundant, since these sequences may be coded more cheaply using the third branch. These code words may be used for other purposes.
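The one-byte code word, the four branches and the end-of-line code word can be summarized in a short encoder sketch. It is not the reference implementation of FIG. 3 but an illustration under stated assumptions: the preferred color is represented by the value 0, the six bits of the second byte are taken as the most significant bits of a 14-bit length, and the names `encode_run` and `encode_line` are hypothetical.

```python
PREFERRED = 0            # assumption: value 0 denotes the preferred (transparent) color
SHORT_MAX = 63           # longest run that fits the six L bits
LONG_MAX = 16383         # longest run that fits the fourteen L bits (2**14 - 1)
END_OF_LINE = bytes([0x00, 0x00])   # shortest redundant code word


def encode_run(color: int, length: int) -> bytes:
    """Encode one run of `length` pixels of `color` with the shortest code word."""
    if not 0 <= color <= 255 or not 1 <= length <= LONG_MAX:
        raise ValueError("color or run length out of range")
    if color == PREFERRED:
        if length <= SHORT_MAX:
            return bytes([0x00, length])                            # 00000000 00LLLLLL
        return bytes([0x00, 0x40 | (length >> 8), length & 0xFF])   # 00000000 01LLLLLL LLLLLLLL
    if length <= 2:
        return bytes([color]) * length                              # CCCCCCCC per pixel
    if length <= SHORT_MAX:
        return bytes([0x00, 0x80 | length, color])                  # 00000000 10LLLLLL CCCCCCCC
    return bytes([0x00, 0xC0 | (length >> 8), length & 0xFF, color])  # 11LLLLLL LLLLLLLL CCCCCCCC


def encode_line(pixels) -> bytes:
    """Split one line into runs of equal color and append the end-of-line code."""
    out = bytearray()
    i, n = 0, len(pixels)
    while i < n:
        j = i
        while j < n and pixels[j] == pixels[i] and j - i < LONG_MAX:
            j += 1
        out += encode_run(pixels[i], j - i)
        i = j
    return bytes(out + END_OF_LINE)


line = [0] * 10 + [5] + [9] * 4 + [0] * 100
encoded = encode_line(line)   # 115 pixels become 11 bytes
```

The sketch never emits the redundant code words mentioned above, i.e. it chooses the one-byte form for runs of one or two non-preferred pixels and the short branches whenever the length is below 64.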

The redundant code words mentioned above may be used to extend the code, e.g. add internal check sums or other information.

The extended run-length encoding table shown in FIG. 3 and described above provides mainly two advantages. First, it allows for the most compact encoding of typical subtitle streams, including transparent areas, small graphical objects and normal subtitle text. Single pixels of any color, as used for small colorful graphics, are coded with a single byte. The dominant color, e.g. transparent for BDP subtitling, is always encoded together with a run-length. Run-length codes are available in two different sizes, or two pixel quantities. In a first step, run-lengths of up to 63 pixels are available as 2-byte code words for the dominant color, and as 3-byte code words for the other colors. In a second step, run-lengths of up to 16383 pixels are available as 3-byte code words for the dominant color, and as 4-byte code words for the other colors. The end-of-pixel-string code, or end-of-line code, is a unique 2-byte code word that can be used for resynchronization. Secondly, the availability of longer sequences for the subtitling area, up to 16383 pixels per code word, means a reduction of redundancy, and therefore of the amount of data. This means that for applications with separate data streams sharing one channel, e.g. multiple data streams on an optical storage medium sharing the same pick-up, bigger portions of the subtitling stream may be loaded with the same amount of data, thus reducing the access frequency for the subtitle stream.
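The code word sizes listed above can be turned into a small cost estimate for one line. The sketch below is illustrative only: the function names, the choice of 0 for the dominant color and the example run list are assumptions, and the comparison baseline of one byte per uncoded pixel is likewise assumed.

```python
def code_word_size(color: int, length: int, dominant: int = 0) -> int:
    """Bytes needed for one run of 1..16383 pixels, following the sizes given in the text."""
    if not 1 <= length <= 16383:
        raise ValueError("run length out of range")
    if color == dominant:
        return 2 if length <= 63 else 3
    if length <= 2:
        return length            # one single-pixel code word per pixel
    return 3 if length <= 63 else 4


def line_size(runs) -> int:
    """Total bytes for a line given as (color, length) runs, plus the 2-byte end-of-line code."""
    return sum(code_word_size(color, length) for color, length in runs) + 2


# Hypothetical 1920-pixel line: long transparent margins, one short stroke, one single pixel.
runs = [(0, 900), (7, 20), (0, 30), (7, 1), (0, 969)]
pixels = sum(length for _, length in runs)
encoded_bytes = line_size(runs)
```

For this example line the five runs need 3 + 3 + 2 + 1 + 3 bytes plus the end-of-line code, i.e. 14 bytes instead of 1920 bytes at one byte per pixel. The actual compression factor depends on the contents of the data stream, as noted in the summary.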

Another aspect of the invention is a further optimization of the data stream for transport using transport packets, e.g. in a packetized elementary stream (PES). Due to the large file size of bitmaps, the packaging of such data, e.g. in object data segments (ODS), is a problem. Often the maximum size of an ODS is limited by other factors, e.g. PES packet size. To fit large bitmaps into such packets, it would be necessary to cut bitmaps into small bitmap pieces before coding, which reduces the compression efficiency. To overcome this bitmap splitting, a new extended object data segment (ExODS) for BDP or comparable applications is disclosed, as shown in FIG. 4. ExODS is a data structure representing each of the fragments into which an ODS is cut for fitting it into a sequence of limited size segments and PES packets. The complete ODS can be reconstructed by concatenating the sequence of individual pieces of consecutive ExODSs.

The start and the end of a sequence of ExODS are indicated by separate flags, first_in_sequence and last_in_sequence. When the first_in_sequence flag is 1, a new sequence is starting. An ExODS having the first_in_sequence flag set to 1 also indicates the size of the decompressed bitmap, by containing its dimensions object_width and object_height. The advantage of indicating the bitmap dimensions is the support of target memory allocation before the decompression starts. Another advantage is that the indicated bitmap dimensions can also be used during decoding for cross checking bitmap dimensions. When the last_in_sequence flag is set to 1, the last ExODS of a complete ODS is indicated. There may be ExODS having set neither the first_in_sequence nor the last_in_sequence flag. These are ExODS pieces in the middle of a sequence. The case of having set both the first_in_sequence flag and the last_in_sequence flag is also possible, if the ODS can be carried within a single ExODS. To overcome the limitation in size available for a single ODS by PES packet size within subtitling, the described type of ExODS may be introduced as a container for pieces of one ODS, e.g. for packaging large ODS for HDTV application. Besides the ODS pieces, the ExODS also carries flags indicating whether it is carrying the first piece, the last piece, a middle piece or the single, complete piece of an ExODS sequence. Furthermore, if the first piece in sequence of the ExODS is transmitted, the dimensions of the resulting ODS, i.e. height and width of the encoded bitmap, are contained in the segment. The indicated bitmap dimensions can also be used for a decoding cross check.
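The use of the two flags can be illustrated with a small fragmentation and reassembly sketch. It does not reproduce the segment syntax of FIG. 4: the `ExODS` class, the functions `split_ods` and `reassemble` and the `max_fragment` parameter are hypothetical, and only the field names first_in_sequence, last_in_sequence, object_width and object_height are taken from the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ExODS:
    first_in_sequence: bool
    last_in_sequence: bool
    data: bytes                          # one piece of the run-length coded ODS
    object_width: Optional[int] = None   # only present in the first piece
    object_height: Optional[int] = None


def split_ods(payload: bytes, width: int, height: int, max_fragment: int) -> List[ExODS]:
    """Cut an already encoded ODS into pieces of at most `max_fragment` bytes."""
    if max_fragment < 1:
        raise ValueError("max_fragment must be positive")
    chunks = [payload[i:i + max_fragment]
              for i in range(0, len(payload), max_fragment)] or [b""]
    segments = []
    for idx, chunk in enumerate(chunks):
        first = idx == 0
        last = idx == len(chunks) - 1
        segments.append(ExODS(first, last, chunk,
                              width if first else None,
                              height if first else None))
    return segments


def reassemble(segments: List[ExODS]) -> Tuple[bytes, Optional[int], Optional[int]]:
    """Concatenate consecutive pieces and return (payload, width, height)."""
    if (not segments or not segments[0].first_in_sequence
            or not segments[-1].last_in_sequence):
        raise ValueError("incomplete ExODS sequence")
    head = segments[0]
    return b"".join(s.data for s in segments), head.object_width, head.object_height


coded = bytes(range(10))                 # stands in for a run-length coded bitmap
pieces = split_ods(coded, width=3000, height=40, max_fragment=4)
restored = reassemble(pieces)
```

Because the bitmap is split after coding rather than before, runs are not interrupted at fragment boundaries, and a decoder could allocate the target memory from the dimensions carried in the first piece.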

The inventive method can be used for compression of bitmap data streams containing e.g. text, images or graphics data for animation, menus, navigation, logos, advertisement, messaging or others, in applications such as e.g. Blu-Ray Prerecorded (BDP) discs or generally high-definition video (HDTV) recordings or broadcast.

The invention discloses a method for run-length encoding of a data stream comprising bitmap formatted subtitle or menu data for video presentation on a display, wherein the subtitle or menu data include graphics or text or both, as shown in FIG. 5. The method comprises the steps of defining a preferred color 510, and defining a range of run-lengths 520. Pixels of the preferred color are encoded to first code words with two or three bytes, wherein the first code words comprise a run-length value 530 and 540-547. The run-length value comprised in first code words having three bytes exceeds the defined range and may exceed the width of the display 547. Pixels of another than the preferred color are encoded to second code words with one, three or four bytes 550-567, wherein the second code words comprise a color value and second code words having three or four bytes comprise a run-length value. The run-length value comprised in second code words having four bytes exceeds the defined range and may exceed the width of the display 565.

A method for run-length decoding of an encoded data stream for a video presentation on a display is described. The method comprises determining the first byte of a code word. If the first byte does not have a defined first value, the first byte is decoded to a single pixel having its color defined by the value of the first byte, the color being other than a defined first color. If the first byte has the defined first value, the method determines the first and second bits of the following byte (the second byte). If the first and second bits of the second byte have a first value, the remaining bits of the second byte are decoded to a sequence of pixels of the defined first color, wherein the remaining bits of the second byte define the sequence length. If the first and second bits of the second byte have a second value, the remaining bits of the second byte together with the following third byte are decoded to a sequence of pixels of the defined first color, wherein the remaining bits of the second byte and the third byte define the sequence length, and the sequence length may exceed the display width. If the first and second bits of the second byte have a third value, the remaining bits of the second byte together with the third byte are decoded to a sequence of pixels of another color. The remaining bits of the second byte define the sequence length and the third byte defines the pixel color. If the first and second bits of the second byte have a fourth value, the remaining bits of the second byte together with the third and a following fourth byte are decoded, wherein the remaining bits of the second byte and the third byte define the sequence length and the fourth byte defines the pixel color, and the sequence length may exceed the display width.
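The decoding steps can be sketched as a loop over the code words of one line. The sketch rests on assumptions that the description leaves open: the defined first value of the first byte is taken to be 00000000, the four values of the two leading bits are taken to be 00, 01, 10 and 11 in the order of the branches described for FIG. 3, the bits of the second byte are treated as the most significant length bits, and the code word 00000000 00000000 is treated as end of line. The function name `decode_line` and its parameters are hypothetical.

```python
from typing import List, Tuple


def decode_line(data: bytes, pos: int = 0, first_color: int = 0) -> Tuple[List[int], int]:
    """Decode code words starting at `pos` until the end-of-line code word.

    Returns the decoded pixel values and the position of the next line.
    Truncated input raises IndexError.
    """
    pixels: List[int] = []
    while pos < len(data):
        first = data[pos]
        pos += 1
        if first != 0x00:                 # single pixel, color given by the first byte
            pixels.append(first)
            continue
        second = data[pos]
        pos += 1
        switch = second >> 6              # first and second bit of the second byte
        length = second & 0x3F            # remaining six bits
        if switch == 0b00:
            if length == 0:               # 00000000 00000000: end of line
                return pixels, pos
            color = first_color
        elif switch == 0b01:              # long run of the defined first color
            length = (length << 8) | data[pos]
            pos += 1
            color = first_color
        elif switch == 0b10:              # short run, color in the third byte
            color = data[pos]
            pos += 1
        else:                             # long run, color in the fourth byte
            length = (length << 8) | data[pos]
            color = data[pos + 1]
            pos += 2
        pixels.extend([color] * length)
    return pixels, pos


stream = bytes([0x00, 0x0A, 0x05, 0x00, 0x84, 0x09, 0x00, 0x40, 0x64, 0x00, 0x00])
decoded, next_pos = decode_line(stream)
```

The decoded values would then typically be mapped to display colors through a look-up table, and a decoder that has lost synchronization could search for the end-of-line code word to resume at the next line.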

1. A method for run-length encoding of a data stream, the data stream comprising bitmap formatted subtitle or menu data for video presentation on a display, wherein the subtitle or menu data include graphics or text or both, comprising the steps of defining a preferred color; defining a range of run-lengths; encoding pixels of the preferred color to first code words with two or three bytes, wherein said first code words comprise a run-length value, and wherein the run-length value comprised in first code words having three bytes exceeds said defined range and may exceed the width of the display; encoding pixels of another than the preferred color to second code words with one, three or four bytes, wherein the second code words comprise a color value, and wherein second code words having three or four bytes comprise a run-length value, and wherein the run-length value comprised in second code words having four bytes exceeds said defined range and may exceed the width of the display.
 2. Method according to claim 1, wherein said color values and the preferred color are mapped with a look-up table to display colors.
 3. Method according to claim 1, wherein the shortest redundant code word is used for line synchronization.
 4. Method for run-length decoding of an encoded data stream for a video presentation on a display, comprising the steps of determining the first byte of a code word; if said first byte has not a defined first value, decoding said first byte to a single pixel having individual color defined by the value of said first byte, the color being other than a defined first color; if said first byte has the defined first value, determining the first and second bit of the following byte being the second byte; if the first and second bit of the second byte have a first value, decoding the remaining bits of the second byte to a sequence of pixels of the defined first color, wherein said remaining bits of the second byte define the sequence length; if the first and second bit of the second byte have a second value, decoding said remaining bits of the second byte together with the following third byte to a sequence of pixels of the defined first color, wherein said remaining bits of the second byte and said third byte define the sequence length, and wherein said sequence length may exceed the display width; if the first and second bit of the second byte have a third value, decoding said remaining bits of the second byte together with the third byte to a sequence of pixels, wherein said remaining bits of the second byte define the sequence length and the third byte defines the pixels color; and if the first and second bit of the second byte have a fourth value, decoding said remaining bits of the second byte together with the third and a following fourth byte, wherein said remaining bits of the second byte and the third byte define the sequence length and the fourth byte defines the pixel color, and wherein said sequence length may exceed the display width.
 5. Method according to claim 4, wherein said defining of a pixel color from the first, third or fourth byte and from said first value comprises using a look-up table.
 6. Method according to claim 4, wherein the encoded data stream for a video presentation is a separate layer overlaying other video data on the display, further comprising the step of selecting a portion of said separate layer for displaying. 